Bacteroides xylanisolvens strains (SD_CC_1b, SD_CC_2a) isolated from human feces were grown on crystalline cellulose.
Cellulolytic properties are not common in Bacteroides species. Here, we report improved genome sequences of both
B. xylanisolvens strains.
Bacteroides xylanisolvens is commonly found in the human gut
(1). The type strain, XB1A, has high xylanase activity and
extensively ferments xylan (2). We isolated two strains, SD_CC_1b
and SD_CC_2a, using a substrate-depleted medium (3) with
crystalline cellulose as the only carbohydrate, at the USDA-ARS
National Laboratory for Agriculture and the Environment (Ames,
IA). Analysis of the postfermentation cellulose structure indicated
that SD_CC_1b and SD_CC_2a might have unique degradation
properties (data not shown). To further determine the genomic
basis for these results, we sequenced the whole genomes of both
strains.

DNA was isolated using a DNeasy blood and tissue kit
(Qiagen). PCR amplification of 16S rRNA genes was done with
27F (5' AGAGGTTTGATCMTGGCTCAG 3') and 1492R
(5' TACGGYTACCTTGTTACGACTT 3') primers (Invitrogen) and a
DYAD DNA Engine thermocycler (MJ Research) to determine
phylogenetic species placement. The PCR mixture (50 µl) contained 1×
Qiagen PCR buffer, 1.25 U of Taq polymerase (Qiagen), 0.25 mM of
each deoxynucleoside triphosphate (dNTP) (Amresco), 25 pmol of
each primer, and 80 ng of template DNA. Amplified products were
cleaned using QIAquick 96 PCR purification (Qiagen) and
sequenced by the Iowa State University DNA Sequencing and Synthesis
Facility (Ames, IA) using an ABI Prism 377 sequencer.
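
As an illustration only (not part of the original protocol), the degenerate positions in the two primers can be expanded with the IUPAC nucleotide codes, and the primer concentration implied by 25 pmol in a 50 µl reaction can be checked. The primer strings are copied as printed above; the function names `expand_primer` and `primer_concentration_um` are hypothetical.

```python
from itertools import product

# IUPAC nucleotide codes (standard); only M and Y occur in these primers.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}

# Primer sequences exactly as printed in the text (5' to 3').
PRIMER_27F = "AGAGGTTTGATCMTGGCTCAG"
PRIMER_1492R = "TACGGYTACCTTGTTACGACTT"


def expand_primer(primer: str) -> list[str]:
    """Return every non-degenerate sequence represented by a primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def primer_concentration_um(pmol: float, volume_ul: float) -> float:
    """Primer concentration in µM (1 pmol/µl equals 1 µM)."""
    return pmol / volume_ul


if __name__ == "__main__":
    for name, seq in (("27F", PRIMER_27F), ("1492R", PRIMER_1492R)):
        print(name, expand_primer(seq))
    print("primer concentration (µM):", primer_concentration_um(25, 50))
```

Each primer contains a single two-fold degenerate base (M = A or C; Y = C or T), so each is a mixture of two oligonucleotides.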

We built upon an assembly using a total of 337,702 reads (215.6
million bp and ~35-fold) for SD_CC_1b (GenBank accession no.
SRX015718) and 245,608 reads (143 million bp and ~23-fold) for
SD_CC_2a (SRX015722), generated using 454 GS-FLX and
assembled using a Newbler assembler (4). A total of 6,059,812 bp in
236 contigs (N50, 60,820 bp) and 6,050,198 bp in 305 contigs (N50,
40,148 bp) were generated for SD_CC_1b (GenBank accession no.
ASM17821v1) and SD_CC_2a (ASM17829v1), respectively. We
added a small-insert (~374 bp) library for each strain prepared
using the standard Illumina (HiSeq2000) protocol. After
screening for sequencing artifacts and ΦX contamination, 50.6 million
(~1,687-fold) and 47.1 million (~1,569-fold) 100-bp paired-end
reads were generated for SD_CC_1b and SD_CC_2a, respectively.
The coverage calculated was based on a genome size estimate of
~6 Mbp. Scaffolds were generated from Illumina reads using the
ABySS (v1.3.4) assembler (5). Illumina and 454 assemblies were
merged using PHRAP (6), an overlap-layout-consensus (OLC)-style
assembler. The SD_CC_1b assembly had a total of 6,484,037
bp (60 scaffolds, with a GC content of 42% and an N50 of 230,871 bp), and
SD_CC_2a had a total of 6,228,594 bp (68 scaffolds, with a GC content of
42% and an N50 of 214,376 bp). The assemblies were greatly
improved: the number of sequences decreased from 236 to 60 in
SD_CC_1b and from 305 to 68 in SD_CC_2a, and the N50 increased from
60,820 bp to 230,871 bp and from 40,148 bp to 214,376 bp for
SD_CC_1b and SD_CC_2a, respectively. The quality levels of
both assemblies were assessed by mapping Illumina reads back to
the assembly using BWA (7). High percentages of reads aligning to
the final genomes (~87% mapped back, with ~86% properly paired and
~85% mapping uniquely for both strains) validated the de novo
assembly process. Gene prediction and annotation for both strains
were performed using the RAST (8) server incorporating GLIMMER
(9, 10). A total of 5,521
protein-coding and 83 RNA sequences were predicted for
SD_CC_1b and 5,328 protein-coding and 75 RNA sequences were
predicted for SD_CC_2a.
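
The two summary statistics used throughout this paragraph, fold coverage and N50, can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It assumes that each Illumina read count refers to read pairs contributing 2 × 100 bp, an assumption that is consistent with the reported ~1,687-fold figure for a ~6-Mbp genome. The toy contig lengths and the function names `fold_coverage` and `n50` are hypothetical.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Average depth: sequenced bases divided by the genome size estimate."""
    return total_bases / genome_size


def n50(lengths: list[int]) -> int:
    """Length L such that sequences of length >= L hold at least half the total."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


GENOME_SIZE_BP = 6_000_000  # ~6 Mbp estimate stated in the text

# Assumption: 50.6 million read pairs, each pair contributing 2 x 100 bp.
illumina_1b = fold_coverage(50.6e6 * 2 * 100, GENOME_SIZE_BP)

# 454 data for the same strain: 215.6 million bp in total.
pyro_1b = fold_coverage(215.6e6, GENOME_SIZE_BP)

# Hypothetical contig lengths, for illustration only.
toy_contigs = [2, 3, 4, 5, 6]

if __name__ == "__main__":
    print("Illumina fold coverage:", round(illumina_1b))
    print("454 fold coverage:", round(pyro_1b, 1))
    print("toy N50:", n50(toy_contigs))
```

Computed values may differ slightly from the reported figures because the read counts and base totals in the text are rounded.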

Nucleotide sequence accession numbers. Draft genome
sequences for both strains have been deposited in the European
Nucleotide Archive, under accession numbers CBXG000000000
(SD_CC_1b) and CBXH000000000 (SD_CC_2a).